1.   INTRODUCTION

A project to provide automated instruction in
computer science, ACSES (Automatic Computer Science Education
System) [8], is in progress at the University of Illinois
at Urbana-Champaign using the PLATO computer-based education
system. An important part of ACSES is a set of interactive
compilers which allow students learning one of several
programming languages to write and execute small programs through
their PLATO terminal. This thesis describes the design
and implementation of the COBOL compiler system in ACSES.

1.1   The  PLATO  system 

PLATO is a computer-based education system which has
evolved since 1959 [7]. The current system, PLATO IV, includes
a CDC CYBER 73 computer linked with over 700 terminals
in more than 100 locations. A hierarchy of memories (fast
core memory, extended core storage and disks) contains lesson
material for a wide variety of academic areas. The PLATO IV
student terminal has a keyboard input device that includes
regular typewriter keys and special function keys. The terminal
screen is an 8 1/2" square plasma display panel, providing
storage of information on the screen itself without
flicker and without being constantly refreshed [9]. A
powerful language, TUTOR, is used to write the instructional
programs, called "lessons."


1.2   General  structure 

The  COBOL  compiler  system  consists  of  three  TUTOR 
lessons.   The  COBOL  editor  accepts  the  program  entered  by 
the  student  using  the  PLATO  keyboard.   The  COBOL  interpreter 
executes the student's COBOL program. A third lesson provides
information needed to use the COBOL system successfully,
such as the subset of COBOL implemented.

Since it is a part of ACSES, the COBOL system is
designed  for  use  by  students  beginning  their  study  of  COBOL. 
Therefore  a  primary  goal  has  been  to  provide  the  student  with 
good  diagnostic  messages,  issued  as  soon  as  possible,  both  in 
the  editor  and  the  interpreter.   The  student  can  write  a  short 
COBOL  program  and  execute  it  within  a  few  minutes. 

Only a relatively small subset of COBOL has been implemented.
This subset, described in chapter two, is designed
to be adequate for beginning COBOL programmers.

A key feature of the COBOL system is that it uses
a table-driven compiler-writing system developed by Dr. Thomas
Wilcox and Michael Tindall [13]. PL/I has already been implemented
using this system, while FORTRAN IV and BASIC are currently
being developed. The use of this language-independent
system greatly reduces the effort required to develop an editor
for a specific language, since the language implementor only
needs to add the specifics of a language to a skeleton editor.
This consists primarily of specifying table entries and
writing a description of the language's syntax. However, the
COBOL interpreter, while using some parts of the compiler-writing
system, is largely language-dependent.


2.   THE  COBOL  SUBSET 

Full  COBOL  is  an  extensive  programming  language 
designed for data processing applications. To implement the
entire COBOL language would greatly exceed the table sizes
of the language-independent editor routines. However, a
relatively  small  subset  of  COBOL,  given  in  Appendix  A,  is 
sufficient  to  satisfy  the  educational  goals  of  ACSES  with 
respect  to  COBOL.   The  overall  criterion  for  selecting  the 
subset  has  been  to  include  all  the  features  essential  to 
writing  elementary  COBOL  programs. 

The American National Standards Institute (ANSI),
formerly known as the United States of America Standards Institute
(USASI), is responsible for maintaining standards for
COBOL. It has defined a minimum subset of COBOL which manufacturers
are expected to include in their COBOL compilers
[10]. This minimum ANSI subset is comparable to the subset
described here. However, several features have been deleted and
a few added, for reasons given in the following paragraphs.

Any COBOL compiler must support the basic data processing
requirements. These include reading and writing data
organized in records and files, moving and comparing data,
doing simple arithmetic calculations, and controlling the flow
of program execution. The COBOL statements OPEN, READ, WRITE,
CLOSE, MOVE, IF, ADD, SUBTRACT, MULTIPLY, DIVIDE, GO,
PERFORM, EXIT and STOP form the minimum set that satisfies
these requirements. All are included in the ANSI subset and
all except DIVIDE are in the PLATO subset.

Compactness is a desirable quality of computer programs
[11] which COBOL programs seldom achieve due to the
requirements of COBOL syntax. In order to achieve some
measure  of  compactness,  the  COMPUTE  verb,  not  included  in  the 
ANSI  subset,  has  been  included  in  the  PLATO  subset.   Without 
the  COMPUTE  verb  a  COBOL  programmer  must  use  several  arithmetic 
statements  and  must  define  intermediate  results  in  order  to 
evaluate  an  arithmetic  expression.  The  DIVIDE  verb  has  been 
excluded,  since  it  is  infrequently  used  and  COMPUTE  includes 
its  functions. 

Other  extensions  to  the  ANSI  subset  which  improve 
the  compactness  and  locality  of  COBOL  are  nested  IF  statements, 
some  forms  of  compound  conditions,  and  arithmetic  expressions 
in  conditions. 

Another influence in the definition of the COBOL subset
was Armstrong [1], who points out the problems with the
ALTER verb and with complex conditions. ALTER often hides program
structure by causing a GO to branch to a paragraph other
than that named. Instead the DEPENDING ON option of the GO TO
has been included. Beginning COBOL programmers often are
confused by the rather complex rules governing precedence of
logical operators and governing implied operators and subjects
of relations. To prevent these problems, implied operators
and subjects are not allowed, the logical connectives AND and
OR may not be used in the same condition, and NOT is restricted
to use as a part of a relation or class test.

The language-independent editor routines are designed
for programs not larger than 28 lines of 64 characters
each (the space available on the PLATO screen). Since COBOL
requires many lines of code for even a trivial program,
modifications to COBOL were required to reduce program sizes.
The first two of COBOL's four divisions, the IDENTIFICATION
and ENVIRONMENT divisions, have been excluded, as well as
the two headers usually beginning the DATA division. This
is not a serious loss, since most of the excluded portions
are fixed division and section headers which the student
can easily learn elsewhere.

The  final  criterion  for  inclusion  in  the  PLATO 
subset  was  space  availability.   Each  table  in  the  editor 
has  a  maximum  size;  the  subset  implemented  uses  nearly  the 
maximum  number  of  predefined  names  and  entries  in  the  syntax 
code  table.   For  this  reason  some  lesser  used  features  that 
are  part  of  the  ANSI  standard  subset  could  not  be  included, 
such  as  the  EXAMINE  verb  and  index  data-items. 


3.   THE  PROGRAM  EDITOR 

3.1  As  seen  by  the  student 

As the student types his program using the PLATO
keyboard, it is displayed on the PLATO screen. At the same
time, it is examined by the lexical, syntactic and
semantic routines of the editor. Any errors detected are
immediately called to the student's attention by drawing a
rectangle around the character or word just entered. If the
student then presses the HELP key, an error message is displayed
explaining the error or listing legal alternatives.
If the student again presses HELP and the error is syntactic
or semantic, a message giving the syntactic type of the last
token entered is written.

By using the special black keys of the PLATO keyboard,
the student can position the cursor at any point in his
program,  add  or  delete  characters,  and  indicate  when  he  has 
completed  his  program. 

3.2  The  tables  of  the  compiler-writing  system 

The editing features described in Section 3.1 are
features of the compiler-writing system incorporated into the
COBOL editor. A set of tables, developed and maintained by
auxiliary programs, describes the internal character codes,
key function tables, default keyboard tab stops, lexical
class assignments, lexical node table, "safe" nodes, predefined
symbols, field symbols, new symbol allocation
classes and syntax tables. Details of each of these tables
as used in the COBOL editor are given in the following
sections.

In addition to the tables, semantic routines and
error messages are added to the editor by the language
implementor.

Although initially developed for free format
languages, the compiler-writing system has been modified to
accommodate features of fixed format languages, such as the
continuation of words between lines in COBOL. In general
the system works well with COBOL, but there are problems
with lexical analysis, discussed in Section 3.5. Semantic
routines, written in TUTOR, must be used extensively,
particularly in parsing data structures and in examining
pictures, to do verifications which are difficult or
impossible through the tables.

3.3   Internal  character  codes 

There are 63 characters available, as given in
Appendix B.1.1. These are the normally printable characters
from the EBCDIC character codes used in the IBM 360/370 and
other computers. This set was chosen to avoid confusion for
the students using PLATO who are also using the IBM computers
available on campus and elsewhere. Although the bit
configurations are different, the characters have the same
ordering as in EBCDIC.

In the character tables seven bits are available
for each character. The extra seventh bit is used in the
interpreter as a sign bit of a signed decimal field. If the
high-order bit of the most significant digit position is on,
the field is negative.

There is a sixty-fourth character, made of three
question marks, which the student cannot enter. At the
beginning of execution the interpreter sets all character
positions to this value. Then the arithmetic routines check for
it to determine whether a field has ever been assigned a value.
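
The sign-bit and unassigned-value conventions described above can be
sketched as follows. This is only an illustration in Python, not the
TUTOR code of the interpreter; the digit codes (DIGIT_BASE) and the
code chosen for the "???" character (UNASSIGNED) are assumptions, since
the actual values are those of Appendix B.1.1.

```python
SIGN_BIT = 0x40    # seventh bit of a 7-bit character cell
CODE_MASK = 0x3F   # low six bits hold the character code
UNASSIGNED = 63    # assumed code of the "???" character
DIGIT_BASE = 0     # assumed: digits 0-9 occupy codes 0..9


def encode_field(value, width):
    """Store an integer as a list of character cells, most significant first."""
    text = str(abs(value)).zfill(width)
    if len(text) > width:
        raise ValueError("value does not fit in field")
    cells = [DIGIT_BASE + int(d) for d in text]
    if value < 0:
        cells[0] |= SIGN_BIT  # sign lives in the most significant digit cell
    return cells


def decode_field(cells):
    """Return the field's integer value, or None if it was never assigned."""
    if any((c & CODE_MASK) == UNASSIGNED for c in cells):
        return None
    negative = bool(cells[0] & SIGN_BIT)
    value = 0
    for c in cells:
        value = value * 10 + ((c & CODE_MASK) - DIGIT_BASE)
    return -value if negative else value
```

For instance, decode_field(encode_field(-42, 4)) gives back -42, while
a field still filled with the UNASSIGNED code decodes to None, which is
the condition the arithmetic routines test for.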

3.4   Columnar  format  and  tab  settings 

Standard COBOL has a fixed column format based on
an 80-column card, but the PLATO screen is only 64 characters
wide. To make the most efficient use of the available space,
those fields used for documentation only, usually columns one
to six and 73 to 80, are not available. In the resulting
format column one may contain a hyphen to indicate
continuation, an asterisk to indicate a line of comments, or a blank.
Columns two through five are area A, where division and
section headers, paragraph names and some level numbers must
begin. Area B is columns six through 64.
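
As a minimal sketch of this 64-column layout, the following Python
function splits a source line into its indicator column, area A and
area B. The function name and the returned labels are illustrative
assumptions, not part of the editor.

```python
LINE_WIDTH = 64  # width of the PLATO screen in characters

INDICATORS = {"-": "continuation", "*": "comment", " ": "normal"}


def classify_line(line):
    """Return (line kind, area A text, area B text) for one source line."""
    line = line.ljust(LINE_WIDTH)[:LINE_WIDTH]
    indicator = line[0]                  # column one
    if indicator not in INDICATORS:
        raise ValueError("column one must be a hyphen, an asterisk or a blank")
    area_a = line[1:5]                   # columns two through five
    area_b = line[5:LINE_WIDTH].rstrip() # columns six through 64
    return INDICATORS[indicator], area_a, area_b
```

A line such as " FD  INFILE." is then reported as a normal line with
"FD  " in area A and "INFILE." in area B.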

During  editing  tab  stops  appear  near  the  top  of  the 
screen.   The  student  may  press  TAB  to  have  blanks  inserted  up 
to  the  next  tab  stop.   The  initial  settings  of  the  tab  stops 
are  in  a  table,  but  may  be  changed  by  the  student. 

The first tab stop setting in COBOL is at column six,
so that by pressing the TAB key at the start of a new line the
student can quickly get to area B. Additional tab stops
provide for easy identification or organization of clauses in
columns, as given in Appendix B.1.2.

3.5   Lexical  tables 

The lexical tables include the lexical class
assignment table, the lexical node table and the safe node
list. Using these tables the language-independent editor
performs the lexical analysis of the COBOL program, consisting
of scanning the input and constructing the tokens of the
language. COBOL tokens may be words, punctuation, arithmetic
operators, relational operators and picture strings.

The model for lexical analysis is a finite state
machine. Lexical analysis of a particular token starts in
state zero, alias node zero; node transitions are then made
according to the current node and the character class of the next
input, as established by the lexical class assignment table.
The column on the screen in which the character appears is also
assigned a character class so that lexical decisions may be
based on the screen location of a character. Table entries
representing illegal inputs at a given node contain error
message numbers. When a legal token is recognized, the table
entry indicates whether the token is extendable (an extendable
token is one whose termination is determined only by a
delimiter not included in the token). There are also
provisions to generate field tokens indicating the presence of
a token in a particular range of columns, to allow for tokens
to be continued between lines and to call a subroutine of
nodes.
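
The table-driven model can be illustrated with a much reduced sketch in
Python. The character classes, node numbers, table layout and error
message number below are all assumptions made for the example; the
actual tables are those of Appendix B. Only extendable tokens (words
and runs of blanks) are handled.

```python
LETTER, DIGIT, HYPHEN, BLANK, OTHER = range(5)


def char_class(ch):
    """Stand-in for the lexical class assignment table."""
    if ch.isalpha():
        return LETTER
    if ch.isdigit():
        return DIGIT
    if ch == "-":
        return HYPHEN
    if ch == " ":
        return BLANK
    return OTHER


# Node 0: start of a token, node 1: inside a word, node 2: inside blanks.
NODE_TABLE = {
    (0, LETTER): ("goto", 1), (0, DIGIT): ("goto", 1), (0, BLANK): ("goto", 2),
    (1, LETTER): ("goto", 1), (1, DIGIT): ("goto", 1), (1, HYPHEN): ("goto", 1),
    (1, BLANK): ("accept", "word"),      # extendable: ended by a delimiter
    (2, BLANK): ("goto", 2),
    (2, LETTER): ("accept", "blanks"), (2, DIGIT): ("accept", "blanks"),
}
NODE_NAMES = {1: "word", 2: "blanks"}
ILLEGAL_CHARACTER = 1  # hypothetical error message number


def scan(text):
    """Return a list of (token kind, lexeme) pairs for one line of text."""
    tokens, node, start, i = [], 0, 0, 0
    while i < len(text):
        entry = NODE_TABLE.get((node, char_class(text[i])),
                               ("error", ILLEGAL_CHARACTER))
        if entry[0] == "goto":
            node = entry[1]
            i += 1
        elif entry[0] == "accept":
            # The delimiter is not part of the token; rescan it from node 0.
            tokens.append((entry[1], text[start:i]))
            node, start = 0, i
        else:
            raise ValueError(f"error message {entry[1]} at column {i + 1}")
    if node != 0:
        tokens.append((NODE_NAMES[node], text[start:]))
    return tokens
```

Scanning "MOVE A-1 TO B" with this table yields alternating word and
blanks tokens, and any character with no table entry at the current
node is reported through its error message number.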

COBOL presented several problems in adapting it to
use the lexical tables. The largest of these was that the lexical
characteristics of picture strings are vastly different from
those of all other COBOL tokens, requiring a far greater
number of node table entries than are available in a single
node table. Minor modifications were made to the editor
to allow a second set of lexical tables for picture
strings, which is loaded in place of the regular set when
needed. Both sets of lexical tables are discussed in the
following paragraphs.

The first set of lexical tables is used for all
tokens except picture strings. The lexical class
assignments, given in Appendix B.1.4, partition the characters
and columns into classes; all members of a class are
equivalent in lexical analysis. The columns are partitioned
into three character classes: column one, two through five
and six through 64, representing the continuation column,
area A and area B, respectively. The characters P, I and
C each belong to a character class consisting only of that
character, so that PIC followed by a blank may be recognized
lexically and used as described later.

Safe nodes are those which can handle the start
of a new line. If the lexical routines are not in a safe
node when the student presses CR (carriage return), blanks
are inserted on the current line until a safe node is reached.
In the COBOL tables the safe nodes, listed in Appendix B.1.6,
include the initial node, the node for a string of blanks,
the node for a comment line and all nodes representing tokens
which may be continued on the next line. An example of the
latter is node one; it represents a sequence of characters
which forms a COBOL word. The word may be terminated by a
delimiter or may be continued on the current line or in
area B of the following line.

The first lexical node table uses every feature
available to it in the compiler-writing system. Most of the
tokens recognized are COBOL words consisting of letters, digits
and hyphens. They are always extendable tokens because they
are terminated by a delimiter, and they may be continued on
a second line. Numeric constants and character string constants
may also be continued. A numeric constant is an extendable
token. Although a character string constant is terminated
by a quote which is part of the token, it is treated as an
extendable token in order to enforce the COBOL rule requiring
a delimiter following it.

A problem that became apparent during program
testing was partially solved by modifying the lexical tables. It
is not difficult to write a COBOL program which fits on the
PLATO screen and could be successfully entered by the student
if there were only more entries in the intermediate text. The
source of the problem is COBOL's verbosity and in particular
its rule that blanks separate most tokens, so that about
one-half of the intermediate text entries are tokens of one or
more blanks. To alleviate this problem the required blank is
included as part of the preceding token whenever possible.
This can be reasonably accomplished only for tokens which must
always be delimited by spaces; this includes the arithmetic
operators: +, -, *, /; the relational operators: <, =, >;
the parentheses and the period. The lexical node table
entries for these tokens define two-character nonextendable
tokens including the special character and a blank.
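
The saving can be illustrated by counting intermediate text entries
with and without the folded blank. The Python sketch below is a
simplification made for the example: it treats every hyphen and period
as a special character, so it is only meaningful for statements without
hyphenated words or decimal constants, and the function name is
hypothetical.

```python
SPECIALS = set("+-*/<=>().")


def intermediate_entries(source, fold_blank):
    """Split a statement into entries, optionally folding one blank
    into the special character that precedes it."""
    entries, i = [], 0
    while i < len(source):
        j = i
        if source[i] == " ":
            while j < len(source) and source[j] == " ":
                j += 1
        elif source[i] in SPECIALS:
            j = i + 1
            if fold_blank and j < len(source) and source[j] == " ":
                j += 1  # two-character token: special character plus blank
        else:
            while (j < len(source) and source[j] != " "
                   and source[j] not in SPECIALS):
                j += 1
        entries.append(source[i:j])
        i = j
    return entries
```

For the statement "COMPUTE X = ( A + B ) * C." the unfolded form needs
20 entries and the folded form 15.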

A serious problem is caused by the dual role of the
period in COBOL as both the decimal point in numeric constants
and as the punctuation terminating a sentence. A series of
digits followed by a period and a blank is interpreted in
COBOL as an integer followed by a punctuation symbol. The
problem arises in the lexical node table if the characters
scanned so far are one or more digits followed by a period.
It is necessary to inspect the next character to determine
the role of the period. If the next character is a space,
the period was punctuation; if it's a digit, the period was
a decimal point. In the former case the lexical node table
entry recognizes an extendable token consisting of an integer
and a period. This peculiar token, logically two tokens, is
assigned to a unique syntactic class. Some additional code
is required throughout both the editor and interpreter for
this class.
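
The one-character lookahead described above might be sketched as
follows in Python. The token kind names are hypothetical, and treating
the end of the text as a blank is an assumption of the example.

```python
def scan_number(text, i=0):
    """Scan a numeric token starting at text[i].

    Returns (kind, lexeme, index of the next unscanned character)."""
    j = i
    while j < len(text) and text[j].isdigit():
        j += 1
    if j == i:
        raise ValueError("expected a digit")
    if j < len(text) and text[j] == ".":
        following = text[j + 1] if j + 1 < len(text) else " "
        if following.isdigit():
            # The period was a decimal point: keep scanning digits.
            k = j + 1
            while k < len(text) and text[k].isdigit():
                k += 1
            return ("decimal", text[i:k], k)
        if following == " ":
            # The period was punctuation: one token, logically two.
            return ("integer-period", text[i:j + 1], j + 1)
        raise ValueError("a period must be followed by a digit or a blank")
    return ("integer", text[i:j], j)
```

Here "12.5 " is scanned as a decimal constant, while "12. " produces
the combined integer-and-period token that needs its own syntactic
class.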

An unresolved problem caused by the dual role of
the period is that two violations of COBOL punctuation rules
cannot be detected. The lexical node tables will accept a
period illegally surrounded by blanks, and a word followed by
a period followed by an integer without any required
intervening blanks. Because the lexical node table has no
information regarding the previous token, it has no way to tell the
role of a period that begins a token and must assume that it
could be either punctuation or a decimal point, while only one
of these can ever be legal at a given point.

Continuation of words, character strings and numeric
constants onto the following line is accomplished using a
subroutine within the lexical node table. This subroutine uses
the continuation features to form an effective token from the
parts on the two lines. It is quite satisfactory, completely
complying with COBOL continuation rules.

Two field tokens are generated by the regular node
table. One indicates the presence of a word or number
beginning in area A (columns two through five) so that the
syntax routines can verify that tokens are placed in the proper
area. The second field token is generated if a token begins
with the characters PIC and blank. This field token triggers
the loading of the second set of lexical tables to scan the
picture string which must follow. A field token is associated
with an entry in the name table but the character string
associated with the name table entry is never printed, because
the field token exists only as a means of passing information.

In the other set of lexical tables, used to scan
picture strings, the finite-state machine model was less than
ideal but the compiler-writing system allowed enough
flexibility to successfully handle the picture strings.

Picture strings in COBOL may be rather complicated.
In order to perform editing of numeric values to be printed
a picture string such as -$(3),$$9VB99 might be used. The
explicit representation of picture strings included in Appendix
E.1 is quite long, requiring the entire node table (32 nodes
and 20 character classes) to implement. As a result some legal
picture strings, such as those using the symbols A, CR, DB,
P and +, are not included in the subset.

This size is a result of the inefficiency of the
finite-state model when applied to picture strings. The many
legal combinations of picture type (alphanumeric, numeric
edited and numeric), floating or replacement character (- $ *
Z or none), whether a decimal point has been seen yet, and
whether an S was present can only be represented by a different
state (node in the table) for each combination.

When a legal picture string has been recognized,
the class of the picture is returned to the syntax routines.
Bits are set within the class number to indicate the picture
type, whether the picture describes a noninteger numeric item
and whether the picture indicates that the item should be
completely blank when it has a zero value.
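
One way to pack these three pieces of information into a class number
is sketched below in Python. The bit positions and type codes are
assumptions chosen for the example; the thesis text does not give the
actual layout.

```python
# Hypothetical layout: two low bits for the picture type, one flag bit each
# for "noninteger numeric item" and "blank when zero".
TYPE_MASK = 0b0011
ALPHANUMERIC, NUMERIC, NUMERIC_EDITED = 0, 1, 2
NONINTEGER = 0b0100
BLANK_WHEN_ZERO = 0b1000


def make_class(picture_type, noninteger=False, blank_when_zero=False):
    """Build a picture class number from its three components."""
    value = picture_type & TYPE_MASK
    if noninteger:
        value |= NONINTEGER
    if blank_when_zero:
        value |= BLANK_WHEN_ZERO
    return value


def describe_class(value):
    """Recover the components from a picture class number."""
    return {
        "type": value & TYPE_MASK,
        "noninteger": bool(value & NONINTEGER),
        "blank_when_zero": bool(value & BLANK_WHEN_ZERO),
    }
```

The syntax routines can then test a single bit, for example
value & NONINTEGER, instead of reexamining the picture string.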

However, more information is needed from a picture
string. In order to verify that the lengths of different
records in the same file and of redefined items are equal, the
syntax routines must know the lengths specified by picture
strings. A semantic routine reexamines the picture strings
lexically, determining the lengths as well as other
information needed later by the interpreter.
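
A length computation of this kind might look like the following Python
sketch. It is not the TUTOR semantic routine. It assumes the usual
COBOL rule that S and V occupy no character position (consistent with
the sign being kept in the seventh bit of a digit) and assumes a symbol
set of X, 9, Z, *, $, -, comma, period and B for the subset.

```python
import re

NO_STORAGE = {"S", "V"}            # sign and assumed decimal point
ONE_POSITION = set("X9Z*$-,.B")    # assumed symbols, one character each


def picture_length(picture):
    """Return the number of character positions a picture describes."""
    length = 0
    # Each match is a symbol optionally followed by a repeat count "(n)".
    for symbol, repeat in re.findall(r"(.)(?:\((\d+)\))?", picture.upper()):
        count = int(repeat) if repeat else 1
        if symbol in NO_STORAGE:
            continue
        if symbol not in ONE_POSITION:
            raise ValueError(f"picture symbol {symbol!r} is not supported")
        length += count
    return length
```

With these rules S9(3)V99 has length 5 and X(10) has length 10, so two
record descriptions can be compared by summing the lengths of their
elementary items.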

An alternate approach to picture strings is to
treat each character in the picture string as a token, putting
the burden of the verification of the picture strings on the
syntactic routines. However, there are several reasons why
this was rejected: it would require a large number of
statements in the syntax program, causing an overflow of the
syntax node table; it would use significantly more entries
in the name table, in the symbol table and in the intermediate
text; and it would reduce the efficiency of the interpreter
because the characters of the picture string would no longer
necessarily be in consecutive character table locations (a
fact used by the editing routines).

3.6  Key  function  tables 

The  key  function  table  assigns  an  action  to  each 
possible  student  keypress.   Usually  the  regular  typewriter  keys 
cause  a  character  to  be  added  to  the  program,  while  the  black 
function  keys  invoke  an  editing  operation  such  as  erasing  a 
word  or  backing  up  a  line. 

The key function table in the regular set of lexical
tables is shown in Appendix B.1.3 along with an explanation of
the special function codes. A second key function table is
in the set of lexical tables for picture strings and is
identical to the first table except for a few of the editing keys.
Because the lexical routines of the compiler-writing system
back up to the start of a token if the student presses any
key that backs up the editor, and the token containing the
picture string includes PIC which must be scanned by the
regular tables, the editing keys which cause the editor to back
up are assigned the function of loading the regular tables.
Then the regular tables are used to determine the actual
editing function to be performed.

3.7  Predefined  symbols 

The editor handles the tokens of a language using
a name table, symbol table, character table and hash table.
Each different character string recognized as a token by
lexical analysis has an entry in the name table. More than
one name may be linked to the same symbol table entry,
such as COMPUTATIONAL and its abbreviation COMP in COBOL.
Also one name table entry may be linked to several symbol
table entries. This is used in COBOL for the name FILLER
which may be used for different unreferenced parts of data
structures.
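
The many-to-many linkage between names and symbols can be sketched in
Python as below. The class and method names are hypothetical, and the
character and hash tables of the real editor are replaced here by an
ordinary dictionary.

```python
from collections import defaultdict


class Tables:
    """Toy name table and symbol table with many-to-many links."""

    def __init__(self):
        self.symbols = []               # symbol table entries (attribute dicts)
        self.names = defaultdict(list)  # name -> indices of linked symbols

    def new_symbol(self, name, **attributes):
        """Create a symbol table entry and link it to a name."""
        self.symbols.append(dict(attributes))
        index = len(self.symbols) - 1
        self.names[name].append(index)
        return index

    def alias(self, name, index):
        """Link an additional name to an existing symbol table entry."""
        self.names[name].append(index)
```

COMPUTATIONAL and COMP would share one symbol entry through alias,
whereas each use of FILLER would call new_symbol again and so attach a
further symbol entry to the same name.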

All  COBOL  symbols  which  have  meaning  without 
declaration  by  the  programmer  are  included  in  the  predefined 
portion  of  the  symbol  table.   This  includes  keywords  and 
optional  words  of  the  PLATO  subset,  arithmetic  and  relational 
operators,  punctuation  and  field  symbols. 

A  few  name  table  entries  were  not  used  by  the 
PLATO  subset.   In  the  vacant  entries  are  common  COBOL  keywords 
not  included  in  the  PLATO  subset,  so  that  an  appropriate  error 
message  can  be  displayed  if  the  student  tries  to  use  one  of 
these  words. 

The  compiler-writing  system  includes  a  table  to 
assign  each  symbol  to  a  procedure  block.  This  feature  is 
designed  for  block  structured  languages  such  as  PL/I,  but 
since  COBOL  has  no  block  structure  this  table  is  not  used 
in the COBOL editor.

3.8   Syntax description

The syntax of the COBOL subset is given in a syntax
description language designed as part of the compiler-writing
system. This language is translated by an auxiliary program
into a table of 12-bit entries to be interpreted by the editor
as the student enters his program. Features of this language
include comparisons based on the token just entered, updating
symbol table entries, recursive calls to procedures,
specifying error message numbers, and calls to semantic routines.

The  COBOL  subset  syntax  language  program  is  included 
in  Appendix  E.2  with  explanations  of  the  functions  of  each 
procedure  and  many  statements.   Only  a  broad  view  of  the 
structure  of  the  syntax  program  is  presented  here. 

There  is  a  sharp  division  in  the  syntax  program 
corresponding  to  the  two  divisions  of  the  COBOL  subset,  the 
DATA  and  PROCEDURE  divisions.   Except  for  the  initial  syntax 
program (that portion preceding the first procedure) the
syntax  of  the  two  divisions  is  totally  separate. 

The DATA DIVISION portion consists largely of two
procedures, "fdproc" and "parsitem". "Fdproc" includes the
syntax of the clauses of the FD statement, and calls "parsitem"
to parse the record descriptions of the FD. A particular
invocation of "parsitem" examines all the items directly
subordinate to the same item in the data structure, calling
itself recursively for any further sub-items.
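
The recursive scheme can be sketched in Python, assuming the data items
are already available as (level number, name) pairs. The function name
and the tree representation are illustrative only; the real "parsitem"
is written in the syntax description language and works token by token.

```python
def parse_items(items, pos=0, parent_level=0):
    """Parse all items directly subordinate to one item.

    Returns (list of (name, sub-items), position of the next unparsed item)."""
    children = []
    sibling_level = None
    while pos < len(items):
        level, name = items[pos]
        if level <= parent_level:
            break                      # belongs to an enclosing structure
        if sibling_level is None:
            sibling_level = level      # first sub-item fixes the level
        if level != sibling_level:
            raise ValueError(f"inconsistent level number {level} for {name}")
        # One invocation handles this item, a recursive one its sub-items.
        subitems, pos = parse_items(items, pos + 1, level)
        children.append((name, subitems))
    return children, pos
```

Given the items 01 REC, 02 NAME, 03 FIRST, 03 LAST, 02 AGE, the top
invocation sees only REC, the next sees NAME and AGE, and a third sees
FIRST and LAST.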

COBOL posed several problems in these two procedures.
Because the clauses of an FD or a data item description
may appear only once and in any order, it would have
required excessive space to have separate logic paths for each
possibility. Instead at the start of the code for each clause
a check of either a symbol table entry or a variable is made
to ensure that the clause has not already occurred in the
current FD or data item description. Because many combinations
of data item clauses are invalid there are many checks for
illegal combinations. Because computations are not available
in the syntax language but are needed to determine the lengths
and storage locations of files and data items, several calls
to semantic routines written in TUTOR are used. More semantic
routines are used to access variables and routines of the
editor not otherwise available to the syntax program and to
check types in relations and VALUE clauses.
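
The "already occurred" check for clauses that may appear once and in
any order amounts to keeping a set of the clauses seen so far. The
Python sketch below illustrates the idea; the function name and the
message wording are hypothetical, and the editor itself keeps this
state in symbol table entries or variables.

```python
def check_clauses(clauses):
    """Return an error message for each clause repeated in one description."""
    seen = set()
    errors = []
    for clause in clauses:
        if clause in seen:
            errors.append(f"{clause} clause already given")
        seen.add(clause)
    return errors
```

One flag per clause replaces a separate logic path for every possible
ordering of the clauses.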

The syntax of the PROCEDURE DIVISION presented fewer
problems. The primary procedure, "statelst", contains the
basic syntax for each COBOL verb. It is called recursively
to parse the list of statements which occur in the true and
false branches of IF statements and in the AT END branch of
the READ statement. There are procedures to parse conditions,
relations and expressions, and to look for a particular
operand type such as a numeric data item. Substantial type
checking is required in MOVE statements and in relations.


4.   THE EXECUTION INTERPRETER

After the student has entered his program he goes
to the COBOL interpreter lesson to execute his program. The
interpreter acts on an intermediate text representation of
the COBOL source program, essentially the program represented
as a series of tokens. The interpreter carries out the
instructions in the PROCEDURE DIVISION by invoking appropriate
TUTOR routines.

The most difficult problem in designing the
COBOL interpreter was the lack of commands in TUTOR to
directly handle binary coded decimal data. The CDC CYBER 73, and
therefore also TUTOR, is organized around a 60-bit word
containing a floating-point number, a fixed-point integer
or 10 six-bit characters. However, in order to use
characters in computation they must first be converted into
fixed-point or floating-point form. COBOL uses binary coded
decimal fields extensively in input and output files, and allows
the programmer to do arithmetic using these fields.

The solution to this problem was inspired by Wilcox's division of
code generation into two phases, translation and coding [12]. The
interpreter is organized in two large parts, which will be called
the "supervisor" and the "machine." The supervisor includes the
translation of expressions to a postfix representation, the
scanning of the intermediate text, all interaction with the student
user, and subroutine calls to the machine. The parameters passed to
the machine represent an operation code and addresses in the
instruction set of an imaginary computer designed to execute COBOL.
This is essentially Wilcox's concept of a "source language
machine." The machine then interprets this instruction set.

As the interpreter developed, the division of functions between
the parts became less distinct. The supervisor is in control of
input-output and branching, using parts of the machine as
subroutines. But the division into two parts remains as the
fundamental organization of the interpreter.

4.1  Primary  data  structures 

Tying the two parts of the interpreter together is a common data
structure, the operand list. Entries in the list are initialized by
the supervisor and modified by the machine. An entry in the list
contains all information needed by the machine about a particular
operand. This may include the data type of the operand, its length
in characters, its number of decimal places, its storage location,
whether it is a signed field, its binary value, whether its binary
value has been determined, the symbol table pointer to the operand
and the symbol table pointer of the operand's picture. The address
of a data item within the machine is represented by a pointer to
the operand list.
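
As a reading aid, the fields listed above can be pictured as a
record. The following Python sketch is illustrative only: the real
operand list is held in TUTOR variables, every field name below is
hypothetical, and the "value has been determined" flag is modelled
here by an empty value rather than by a separate field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperandEntry:
    """One operand list entry (all field names are hypothetical)."""
    data_type: str                # e.g. "numeric" or "alphanumeric"
    length: int                   # length in characters
    decimals: int = 0             # number of decimal places
    offset: int = 0               # storage location in the machine's core
    signed: bool = False          # whether the field is signed
    value: Optional[int] = None   # binary value; None = not yet determined
    symbol: int = 0               # symbol table pointer of the operand
    picture: int = 0              # symbol table pointer of its picture

    @property
    def value_known(self) -> bool:
        return self.value is not None


# The supervisor initializes entries; the machine later fills in values.
operand_list = [OperandEntry("numeric", length=5, decimals=2, offset=40)]
address = 0   # a machine "address" is an index into the operand list
```

In this picture the machine never sees a raw storage location as an
address; it always goes through an operand list entry.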

Another  important  data  structure  is  the  symbol 
table  constructed  by  the  COBOL  editor.   It  contains  static 
information  about  each  symbol  in  the  source  program,  such  as 
its  length,  location  and  relationship  to  other  items  in  a 
data  structure.   Details  of  how  the  symbol  table  entries  are 
used  for  various  symbol  classes  are  included  in  Appendix  E.1;
this  information  is  needed  because  the  names  of  the  symbol 
table  entries  do  not  necessarily  reflect  their  usage  in 
COBOL.   Anyone  examining  the  lessons  in  the  future  will  have 
to  refer  to  documentation  to  understand  many  of  the  statements 
involving  symbol  table  entries. 

In  two  cases  symbol  table  fields  are  not  static 
but  are  used  by  the  supervisor  to  store  information.   Fields 
in  the  symbol  table  entry  for  a  file  are  used  to  indicate 
the  file  status  and  to  hold  the  offset  of  the  current  record 
from  the  first  character  of  the  block.   While  executing  a 
PERFORM  the  symbol  table  entry  of  the  last  paragraph  of  the 
PERFORM  range  is  modified  to  indicate  that  an  exit  is  needed. 

Discussion  of  other  data  structures  is  included 
with  the  discussion  of  the  related  statement. 

4.2   The  supervisor

4.2.1   Interaction  with  the  student 

To a large degree the COBOL interpreter runs without student
intervention. The student is involved in three ways: choosing the
execution option, supplying input and responding to error messages.

The student may choose among three methods of execution. One
provides a complete trace of the program, with all changes in
variables displayed and a box drawn around each paragraph name as
it is entered and around each verb as it is executed. A second
method includes the statement trace but not the variable trace. The
third has no tracing; the student learns about the progress of the
program only when a READ, WRITE or STOP is executed or when an
error occurs. The full trace method will probably be used most
often, since with it a student can see exactly what his COBOL
statements are doing.

The supervisor asks the student to enter input data whenever a
new block of data is needed by a READ statement. Since blocked
input files are included in the subset, not every execution of a
READ causes an input request. If the student indicates that there
is no more data, the AT END branch of the READ is executed. If he
has data to enter, the supervisor asks for input for each
elementary item in each record of the block. To make this possible,
the symbol table includes pointers to the "son," "brother" and
"father" of each item in the data structure. This structure is
taken from Knuth [5], as is the algorithm for traversing the
structure to reach every elementary item.
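
The traversal can be sketched as follows. This is a generic walk
over son, brother and father links written in Python for
illustration; it is not claimed to be the exact algorithm given by
Knuth or the TUTOR code of the supervisor, and the names Item, link
and elementary_items are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(eq=False)
class Item:
    name: str
    son: Optional["Item"] = None       # first subordinate item
    brother: Optional["Item"] = None   # next item at the same level
    father: Optional["Item"] = None    # containing group item


def link(father: Item, *sons: Item) -> Item:
    """Attach sons to father, chaining them as brothers."""
    for s in sons:
        s.father = father
    for a, b in zip(sons, sons[1:]):
        a.brother = b
    father.son = sons[0]
    return father


def elementary_items(record: Item) -> Iterator[Item]:
    """Yield every elementary item (an item with no son) under record."""
    node = record
    while True:
        if node.son is not None:
            node = node.son            # descend into a group item
            continue
        yield node                     # no son: an elementary item
        # Climb until a brother exists or the record is reached again.
        while node is not record and node.brother is None:
            node = node.father
        if node is record:
            return
        node = node.brother


record = link(Item("REC"),
              link(Item("NAME"), Item("FIRST"), Item("LAST")),
              Item("AGE"))
names = [item.name for item in elementary_items(record)]
```

For the record above, the walk visits FIRST, LAST and AGE in that
order, which is the order in which input would be requested.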

The WRITE statement does not require any student intervention,
but it is discussed here because it is related to the READ. Because
COBOL is often used to generate reports, it is beneficial to see
output as it normally appears on a printed page rather than in the
few lines available on the same screen display as the COBOL
program. Therefore, the number of records per block of an output
file is defined as the number of lines per page to be seen on a
screen display consisting solely of the student's output. If the
student forgot to assign values to any parts of the output record,
those parts contain triple question mark characters to alert him to
his oversight.

Whenever any part of the COBOL interpreter detects an execution
error, an appropriate error message is displayed. The student then
has the choice of terminating execution (hopefully to return to the
editor to correct his program) or of taking corrective action. For
some errors, the corrective action is fixed. For example, the
corrective action for a read of an unopened file is to open the
file. If the error is an invalid numeric value, the student is
asked to supply a correct value to replace it. This includes
correcting an out-of-range subscript or replacing a nonnumeric
character in a numeric field. The student may receive help in
correcting some errors by pressing the HELP key.

4.2.2   Expression  evaluation 

In the COBOL subset arithmetic expressions may appear in COMPUTE
statements and in conditions in IF and PERFORM statements. The
expressions appear in infix order in the intermediate text but must
be evaluated according to the precedence rules of the operators. A
complication is that the number of decimal places to use for
intermediate results can only be determined after the maximum
number of decimal places in all operands is known (this problem is
discussed further in Section 4.3.1). Therefore, in order to avoid
two scans of the intermediate text, one to determine the maximum
and another to execute the expression, a postfix representation of
the expression is constructed as the expression is scanned from the
intermediate text.

The infix to postfix translation uses the operator precedence
algorithm given by Gries [2]. The operators are the arithmetic
operators +, -, * and /; the relational operators <, = and >; NOT;
the NUMERIC test; and assignment. The "f" and "g" precedence values
and the machine operation codes corresponding to the operators are
stored in otherwise vacant fields for predefined entries in the
symbol table. A stack is used for operators, while operands are
added to the previously mentioned operand list.

The  postfix  representation  is  a  queue  of  operations 
to  be  performed.   Each  entry  in  the  queue  has  an  operation 
code  and  two  pointers  to  the  operand  list.   If  one  of  the 
operands  is  an  intermediate  result,  the  pointer  for  it  in 
the  operation  queue  may  be  wrong.   However,  the  machine  can 
detect  the  error  and  correct  the  pointer. 

As  each  operand  is  added  to  the  operand  list  using 
unit  "address"  the  maximum  number  of  decimal  places  in  any 
operand  is  determined.   After  the  entire  expression  has  been 
scanned  the  machine  is  called  to  execute  all  of  the  operations 
in  the  queue. 
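
The single-scan translation described in this section can be
sketched in Python as follows. The sketch is a simplification under
stated assumptions: it uses one assumed precedence value per
operator instead of the "f" and "g" values of Gries's algorithm, it
handles only the four arithmetic operators without parentheses, and
it marks an intermediate result with None, whereas the actual queue
holds a pointer that the machine corrects later. All names are
hypothetical.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}   # assumed values


def translate(tokens, decimals):
    """Translate an infix token list in one scan.

    decimals maps each operand name to its number of decimal places.
    Returns (operand_list, queue, max_decimals). Queue entries are
    (operator, left, right), where left and right are indexes into
    operand_list or None for an intermediate result.
    """
    operand_list, queue, operators, pending = [], [], [], []
    max_decimals = 0

    def emit(op):
        right = pending.pop()
        left = pending.pop()
        queue.append((op, left, right))
        pending.append(None)            # the result of this operation

    for tok in tokens:
        if tok in PRECEDENCE:
            # Emit stacked operators of equal or higher precedence first.
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]:
                emit(operators.pop())
            operators.append(tok)
        else:
            operand_list.append(tok)
            pending.append(len(operand_list) - 1)
            max_decimals = max(max_decimals, decimals[tok])
    while operators:
        emit(operators.pop())
    return operand_list, queue, max_decimals


operands, queue, places = translate(
    "A + B * C".split(), {"A": 2, "B": 0, "C": 3})
```

For A + B * C the queue holds the multiplication of B and C first,
then the addition of A and the intermediate result, and the maximum
number of decimal places is already known when the scan ends.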

4.2.3   The  PERFORM  statement 

Several  options  of  the  PERFORM  verb  are  included 
in  the  COBOL  subset,  requiring  a  fairly  large  statement 
driver  in  the  supervisor.   Control  of  the  PERFORM  is  in  the 
supervisor;  the  machine  is  used  only  to  evaluate  conditions 
and  determine  values. 

To save information needed about all PERFORMs currently in
effect, a stack is used. Each stack entry includes the first and
last paragraph of the PERFORM range, the paragraph containing the
PERFORM, the location of the PERFORM in the intermediate text and,
if the TIMES clause was used, the number of iterations remaining.

If a paragraph is the last paragraph in the range of a PERFORM,
its symbol table entry contains a pointer to the perform stack. At
the end of execution of the paragraph, this pointer should match
the current perform stack pointer. If not, there must be recursion
or overlapping ranges in the PERFORM statements. This error always
causes an end to execution, because the supervisor cannot repair an
incorrect PERFORM structure. If the pointers match, control is
transferred to the PERFORM statement to decide whether another
iteration is needed.

This  scheme  implements  the  PERFORM  verb  exactly  as 
in  standard  COBOL.   Nested  PERFORMS  are  valid  if  their  ranges 
do  not  overlap  and  if  no  paragraph  is  called  recursively. 
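
The stack discipline can be illustrated with a small Python model.
It is a sketch under assumptions, not the supervisor's code: only
the TIMES form is modelled, the pointer kept in the symbol table
entry of the last paragraph is represented by the dictionary
exit_marker, and all names are hypothetical.

```python
class PerformError(Exception):
    """Recursive or overlapping PERFORM ranges were detected."""


perform_stack = []   # one entry per PERFORM currently in effect
exit_marker = {}     # last paragraph name -> expected stack index


def begin_perform(first, last, times=1):
    perform_stack.append({"first": first, "last": last, "left": times})
    exit_marker[last] = len(perform_stack) - 1


def end_of_paragraph(name):
    """Decide what happens when paragraph `name` finishes.

    Returns "fallthrough" (not the end of an active range),
    "repeat" (another iteration) or "return" (the PERFORM is done).
    """
    if name not in exit_marker:
        return "fallthrough"
    if exit_marker[name] != len(perform_stack) - 1:
        raise PerformError("recursive or overlapping PERFORM ranges")
    entry = perform_stack[-1]
    entry["left"] -= 1
    if entry["left"] > 0:
        return "repeat"
    perform_stack.pop()
    del exit_marker[name]
    return "return"
```

A mismatch between the saved pointer and the top of the stack is
the only check needed to detect an invalid PERFORM structure in this
model.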

4.2.4   The  GO  TO  statement 

The branching of the COBOL program within the intermediate text
is done by the GO TO statement driver in the supervisor. The
DEPENDING ON option is implemented using an array, constructed
during the interpretation of the GO TO, which saves the symbol
table pointer of each possible destination and is then indexed by
the value of the named variable. This array cannot be replaced by
the simple technique of using the index value to locate the
intermediate text entry of the destination, because one or more
paragraph names might be continued between lines, causing an extra
entry in the intermediate text in addition to the blank token
between each paragraph name.
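
A minimal Python sketch of the indexed branch follows. The function
name is hypothetical, and the handling of an out-of-range value
shown here follows standard COBOL (control passes to the next
statement); the text does not say how the PLATO system treats that
case.

```python
def go_to_depending(destinations, index):
    """Select a branch target for GO TO ... DEPENDING ON.

    destinations: symbol table pointers saved while the statement
                  was interpreted, in the order written.
    index:        current value of the named variable (1-based).
    Returns the selected pointer, or None when the value is out of
    range (assumed here to mean "continue with the next statement").
    """
    if 1 <= index <= len(destinations):
        return destinations[index - 1]
    return None


targets = [210, 245, 198]   # made-up symbol table pointers
```

Because the array holds symbol table pointers rather than positions
in the intermediate text, continuation entries in the text cannot
disturb the indexing.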

4.2.5  Other  statements 

The  OPEN  and  CLOSE  statements  simply  set  a  value  in 
the  symbol  table  entry  of  the  specified  file  to  indicate  the 
file's  status.   An  illegal  file  status  in  any  of  the  input- 
output  statements  results  in  an  error  message.   CLOSE  also 
writes  any  partially  full  block  onto  the  screen. 

The ADD, SUBTRACT, MULTIPLY and MOVE statements require very
little effort in the supervisor, since their actual execution is
done by the machine. The statement driver routines in the
supervisor call the subroutines needed to organize a machine
"instruction," then call the machine with the assembled
instruction.

The STOP statement terminates execution of the program. Any
symbol table entries that were changed during execution are reset
to their initial values. The student can choose to see a page of
execution statistics, including such items as how much CPU time and
how much storage for data items was used.

4.2.6  Unit  "address" 

One unit of TUTOR code in the supervisor is so central to the
interpreter that it deserves a detailed discussion. Unit "address"
prepares the operand list entries for use by the machine from
symbol table entries and subscript values. Its importance is that
it organizes information about an operand into an efficient form
for use by the machine.

Its most complex function is to determine the offset of each item
from the start of the area representing the core of the COBOL
machine. Because the character table and the core both contain
seven-bit characters, the character table can be used as if it were
a part of core. This is used for constants so that, to the machine,
constants and variables look almost alike. No core space is wasted
duplicating the character table and no extra moves need to be made.
The offset and length fields for constants in the operand list are
set to exclude the quotes surrounding character strings and to
exclude signs and trailing periods in numeric constants.

To determine the effective offset of a data item, three values
are added together: the base offset contained in the item's symbol
table entry, the displacement of the current record from the first
record of the block, and the displacement of this occurrence of the
item from the first occurrence of the item. The base offset is the
location of the leftmost character of the first (that occurrence
with a subscript of one) or only occurrence of the item in the
first record of a block. By using a pointer in the data item's
symbol table entry, the symbol table entry of the appropriate file
is reached; a field in the file's symbol table entry contains the
record displacement (if the last record read in an input file was
the ith in the block and the record length is ℓ, then the record
displacement is ℓ(i-1)). Items in WORKING-STORAGE always have a
zero record displacement. If the item has a subscript, the third
value is determined by first obtaining the value of the subscript
by recursively calling "address" and calling a routine in the
machine. A field in the item's symbol table entry points to the
item with the OCCURS clause; the length of the latter item and the
value of the subscript determine the occurrence displacement (if ℓ
is the length and i the subscript value, the occurrence
displacement is ℓ(i-1)).
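
The offset arithmetic can be written out as a short Python
function. The function and parameter names are hypothetical; only
the formula, base offset plus ℓ(i-1) for the record plus ℓ(i-1) for
the occurrence, comes from the description above.

```python
def effective_offset(base, record_length=0, record_index=1,
                     occurs_length=0, subscript=1):
    """Return base + record displacement + occurrence displacement.

    record_index and subscript are 1-based, as in COBOL. The default
    arguments describe an unsubscripted WORKING-STORAGE item, whose
    two displacements are both zero.
    """
    record_displacement = record_length * (record_index - 1)
    occurrence_displacement = occurs_length * (subscript - 1)
    return base + record_displacement + occurrence_displacement


# Third record of a block of 80-character records, fourth occurrence
# of a 6-character item whose base offset is 10.
offset = effective_offset(10, record_length=80, record_index=3,
                          occurs_length=6, subscript=4)
```

With these made-up figures the offset is 10 + 160 + 18 = 188.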

The other fields of an operand list entry are reformatted symbol
table information or are initialized to zero. One exception applies
to numeric constants. In order to avoid repeated decimal-to-binary
conversions, the constant's binary value is determined once, then
saved in a symbol table field (if it can be expressed in 12
unsigned bits). This value is copied by "address" into the value
field of the operand list.

4.3   The  machine

The machine operates in three modes: binary, decimal and
character. The mode of an instruction is a function of the
operation code and the types of the two operands. Because all
arithmetic operations are done in binary mode, the binary mode
subroutine is called directly by the supervisor for ADD, MULTIPLY
and SUBTRACT statements. For expression evaluation, including
comparisons, and for MOVE statements, a three-dimensional array,
initialized at the start of interpretation and referenced upon
entry into the machine, determines the mode. The subscripts of the
array reference are the types of the two operands and the type of
operation code; the values in the array are 0, 1, 2 and 3,
indicating illegal cases, character mode, decimal mode and binary
mode, respectively.
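
The table lookup can be sketched in Python. Only the four mode
values come from the text; the operand type names, the two operation
classes and, in particular, the table entries are invented for
illustration and should not be read as the contents of the actual
table.

```python
ILLEGAL, CHARACTER, DECIMAL, BINARY = 0, 1, 2, 3   # values from the text

TYPES = ["alphanumeric", "display-numeric", "computational"]  # assumed
OPS = ["move", "compare"]                                     # assumed


def build_mode_table():
    """Build table[type1][type2][op]; entries are illustrative guesses."""
    n = len(TYPES)
    table = [[[ILLEGAL for _ in OPS] for _ in range(n)] for _ in range(n)]
    for i, t1 in enumerate(TYPES):
        for j, t2 in enumerate(TYPES):
            pair = (t1, t2)
            for k in range(len(OPS)):
                if t1 == t2 == "alphanumeric":
                    table[i][j][k] = CHARACTER
                elif "computational" in pair and "alphanumeric" not in pair:
                    table[i][j][k] = BINARY
                elif t1 == t2 == "display-numeric":
                    table[i][j][k] = DECIMAL
    return table


MODE = build_mode_table()   # initialized once, before interpretation


def mode_of(type1, type2, op):
    """Look up the machine mode for one instruction."""
    return MODE[TYPES.index(type1)][TYPES.index(type2)][OPS.index(op)]
```

Building the table once and indexing it on every instruction
replaces a chain of type tests with a single array reference.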

The  machine  has  an  instruction  set  of  12  commands: 

 1  divide
 2  multiply
 3  add
 4  subtract
 5  complement algebraically
 7  move
 8  store
 9  compare for less than
10  compare for equal to
11  compare for greater than
12  complement logically
15  test for numeric

In addition, one subroutine of the machine is often called
directly to return a value needed immediately by the supervisor
(for example, when evaluating a subscript).

When evaluating an expression, the machine executes each
instruction in the queue prepared by the supervisor. As mentioned
earlier, some of the operand list pointers may be incorrect. This
is determined by checking the format of the operand pointed to. If
it is null, the operand list is searched sequentially for the first
operand whose format is not null; this is the correct operand. For
involved expressions requiring many intermediate results this
algorithm is rather inefficient. However, complicated expressions
should be rare in most COBOL programs, and the algorithm does
permit a convenient scheme for allocating intermediate results.

Although two operands are used to determine the mode, the machine
executes instructions using zero, one, two or three operands,
depending on the instruction. For example, logical negation has no
explicit operand (it uses a reserved variable) while addition has
three: the sum of the first two operands is stored in the third. If
zero is specified as the third operand, the machine is to establish
an operand list entry for an intermediate result; it will be in one
of the slots vacated by erasing the first two operands from the
list. This arrangement proved to be very flexible; it was developed
after investigation showed that a fixed number of operands would be
cumbersome.

4.3.1   Binary  mode

Binary  mode  is  by  far  the  most  complex  of  the  three 
modes.   It  includes  all  arithmetic  and  editing  operations  as 
well  as  moves  and  compares  involving  binary  operands. 

The central problems faced by the binary mode arithmetic units
have been clearly stated by Lippit [6]:

(1)  How  can  we  achieve  the  same  arithmetic 
result  on  a  nondecimal  machine  as  on  a 
decimal  machine? 

(2)  Regardless  of  machine  type,  what  are  the 
rules  to  be  followed  in  performing  the 
arithmetic  operation? 

Lippit's solution to the first problem is used: simulate decimal
arithmetic, keeping track of decimal point alignment. To do this
each value is represented as an integer-valued floating-point
number and a power of ten. This avoids the representation errors in
noninteger floating-point numbers which do not occur on a decimal
machine.
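
The representation can be illustrated in Python. The sketch below
holds each value as a pair (mantissa, exponent) meaning mantissa
times ten to the exponent; it uses Python integers for the mantissa,
whereas the interpreter uses integer-valued floating-point words,
and the function names are hypothetical.

```python
def align(a, b):
    """Scale two (mantissa, exponent) pairs to their smaller exponent."""
    (ma, ea), (mb, eb) = a, b
    e = min(ea, eb)
    return ma * 10 ** (ea - e), mb * 10 ** (eb - e), e


def add(a, b):
    """Add two scaled values after aligning their decimal points."""
    ma, mb, e = align(a, b)
    return (ma + mb, e)


def multiply(a, b):
    """Multiply mantissas and add exponents; no alignment is needed."""
    return (a[0] * b[0], a[1] + b[1])


tenth, fifth = (1, -1), (2, -1)   # 0.1 and 0.2
total = add(tenth, fifth)         # exactly 0.3, held as (3, -1)
```

In ordinary binary floating point the same sum is not exact
(0.1 + 0.2 differs from 0.3), which is the kind of representation
error the scaled form avoids.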

The rules to be followed are partially described by COBOL
options. The ROUNDED option requires that results be rounded, but
because this option is not available in the subset, results must
instead always be truncated. Although the SIZE ERROR option is not
part of the subset's syntax, it is implemented for all arithmetic
operations as an aid to the student in debugging his programs.
Checks for overflows causing loss of significance are made. If an
overflow occurs, an error message is displayed so that the student
may enter a smaller value.

The number of integer and decimal places to be carried in
intermediate results is not specified by ANSI. The rules used for
intermediate results are essentially those given for IBM 360 COBOL
[3]. These rules specify how to carry the maximum number of
significant digits in each intermediate result based on the number
of integer and decimal places of each operand and on the maximum
number of decimal places of any operand. The only modification made
was to reduce the maximum number of significant decimal digits from
30 to 14 in order to fit any value into one CYBER 73 word. This is
a mild restriction since 14 should normally be ample.

Binary mode must convert all decimal operands to binary, perform
its computations in binary, then convert the result to decimal if
necessary. The binary-to-decimal conversion is aided by a TUTOR
command for this function, but no corresponding command for
decimal-to-binary conversion is available. The decimal-to-binary
routine verifies that each digit position does in fact contain a
digit. If not, an error routine displays the illegal character and
accepts a correction.
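
A decimal-to-binary routine with this digit check might look like
the following Python sketch. It is illustrative only: where the
interpreter displays the illegal character and accepts a correction
from the student, the sketch simply raises an error, and the
function name is hypothetical.

```python
def decimal_to_binary(field):
    """Convert a string of digit characters to an integer.

    Raises ValueError naming the first character that is not a digit.
    """
    value = 0
    for position, ch in enumerate(field, start=1):
        if not ("0" <= ch <= "9"):
            raise ValueError(
                f"illegal character {ch!r} in digit position {position}")
        value = value * 10 + (ord(ch) - ord("0"))
    return value


amount = decimal_to_binary("00123")
```

The check costs one comparison per character and catches blank or
alphabetic data in a numeric field before it can corrupt a result.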

Also included in binary mode is the editing of numeric edited
fields. This includes suppression of leading zeros, replacement of
leading zeros by asterisks, fixed and floating dollar signs and
minus signs, and insertion of periods, commas, blanks and zeros, as
prescribed by picture strings. Because there is no equivalent
instruction in TUTOR, a fairly large subroutine is required.
Information about the picture stored by the editor in the picture's
symbol table entry, as well as the characters of the picture
string, dictates the progress of this subroutine.
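
To give a feel for the editing rules, here is a deliberately small
Python sketch that handles only the picture characters Z, *, 9,
comma and period. It is not the interpreter's subroutine: dollar
signs, minus signs, inserted blanks and zeros, and the special
treatment of all-zero values are omitted, and the function name is
hypothetical.

```python
def edit(digits, picture):
    """Edit a digit string through a picture of Z, *, 9, ',' and '.'.

    digits must supply one digit per Z, * or 9 in the picture.
    """
    fill = "*" if "*" in picture else " "
    out = []
    suppressing = True            # still inside the leading zeros
    remaining = iter(digits)
    for p in picture:
        if p in "Z*":
            d = next(remaining)
            if d != "0":
                suppressing = False
            out.append(fill if suppressing else d)
        elif p == "9":
            suppressing = False   # a 9 position is never suppressed
            out.append(next(remaining))
        elif p == ",":
            out.append(fill if suppressing else ",")
        else:
            suppressing = False   # the period ends suppression
            out.append(p)
    return "".join(out)


blank_filled = edit("0012345", "ZZ,ZZ9.99")
star_filled = edit("0000500", "**,**9.99")
```

Even this reduced version needs a state flag that is carried across
picture characters, which suggests why the full routine is large.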

4.3.2  Decimal  mode 

The  move  and  compare  operations  are  executed  by 
the  decimal  mode  routine  in  a  direct  manner.   The  operations 
proceed  one  character  at  a  time  from  left  to  right  in  the 
decimal  operands.   Decimal  point  alignment  is  accomplished 
by  appropriate  initialization  of  the  pointers  to  characters 
within  the  operands.   As  in  the  decimal-to-binary  conversion 
each  digit  position  must  contain  a  digit  or  else  an  error 
occurs. 

4.3.3  Character  mode 

This  is  the  least  complicated  of  the  machine's 
three  modes.   Its  operations  are  moves,  comparisons  and 
numeric  tests.   They  proceed  left  to  right  one  character  at 
a   time,  with  truncation  and  padding  with  spaces  in  moves 
and  with  extension  of  the  shorter  operand  with  spaces  in 
comparisons.   No  error  conditions  are  possible. 
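
The character mode move and comparison can be sketched in Python
as follows. The function names are hypothetical, and the comparison
uses Python's character ordering, which need not agree with the
collating sequence of the PLATO character set.

```python
def char_move(source, length):
    """Move source into a field of the given length.

    Longer sources are truncated on the right; shorter ones are
    padded on the right with spaces.
    """
    return source[:length].ljust(length)


def char_compare(a, b):
    """Compare left to right; the shorter operand is space-extended.

    Returns -1, 0 or 1 for less than, equal to or greater than.
    """
    width = max(len(a), len(b))
    for x, y in zip(a.ljust(width), b.ljust(width)):
        if x != y:
            return -1 if x < y else 1
    return 0
```

Since every character is acceptable in an alphanumeric field,
neither routine has an error path.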

5.   DOCUMENTATION  PROVISIONS

As with almost any system of programs, the PLATO COBOL system
will probably require changes in the future. Documentation is
therefore important as a source of guidance to someone who needs to
understand and change the COBOL tables or lessons. Ideally, a
person with the proper background who carefully studies the
documentation will acquire a thorough understanding of the COBOL
system.

Changes may be needed for many reasons. Both PLATO and the
compiler-writing system are relatively new and may be changed
considerably. Changes in either might require changes in the COBOL
lessons or might permit improvements in them. Other features might
be added to the COBOL subset, although currently space restrictions
prevent this. It is possible that there are undetected errors in
the system which need to be fixed, although the system has been
thoroughly tested.

In order to fully understand the COBOL system, a person needs to
have a solid background in several areas. Of course he needs to
understand the principles of compiler construction and of the
compiler-writing system, and he needs a good knowledge of the TUTOR
language. In order to understand the rules of COBOL that are
imposed upon the student's program and obeyed by the interpreter,
he needs a detailed knowledge of COBOL, or at least a fair
knowledge plus a handy reference manual.

Documentation separate from the programs it documents can easily
become obsolete as changes are made to the programs. To reduce this
problem most of the documentation is included with the programs,
accessible to a future author through his PLATO terminal. This
should encourage the future author to change the documentation
whenever he changes a program. The only exception is this thesis,
which is an introductory guide to the structure of the COBOL system
and to the major problems encountered in it. The thesis is more
general than the other documentation and less subject to change.

Two sources provide information about the COBOL subset. One
description of the subset, which uses standard COBOL notation and
explains many features of the COBOL system, is available to both
students and authors in lesson "trw2" and is included in
Appendix A. Another description using an extended BNF notation is
included within lesson "cobolcomp." Also in "cobolcomp" are a guide
to all documentation, an explanation of how fields of the symbol
table are used for different syntactic classes, and an explanation
of the modifications made to accommodate two sets of lexical
tables.

The tables of the system are maintained by auxiliary programs of
the compiler-writing system and may be easily inspected. Most of
the COBOL tables, given in Appendix B, can be understood just by
studying their entries. The lexical node table is an exception; it
is very difficult to follow the lexical recognition of a token by
inspecting the table. To clearly display how the node table
constructs tokens, a short TUTOR lesson to generate transition
diagrams based on the node tables is available; a sample is
contained in Appendix D.1.

The most detailed documentation is provided by extensive comments
included within the programs ("cobolcomp," the syntax program, and
"cobolrun"). The comments explain the function of variables and
tell the purpose of program units and of the more obscure
statements. However, since the comments presume a knowledge of the
more general documentation, the program listings should not be the
first documentation studied.

As further documentation for the interpreter, two TUTOR routines
provide graphic representations on a PLATO screen. One is a set of
diagrams showing how the units of "cobolrun" are related
(Appendix D.2); the other is a diagram displaying the table that
determines the mode of the machine (Appendix D.3).

6.   CONCLUSIONS

The implementation of the interactive COBOL compiler system on
PLATO has demonstrated the general suitability of the
compiler-writing system to COBOL. Although we have seen that COBOL
is not ideally suited to the system, the system is sufficiently
flexible to accommodate almost all requirements of COBOL.

While  the  table-driven  system  is  available  for 
the  editor  part  of  the  compiler,  the  interpreter  part  is  a 
program  written  exclusively  for  COBOL  and  required  much  more 
time  than  the  editor.   An  extension  to  the  compiler- 
writing  system  to  include  some  form  of  language-independent 
interpretation  is  a  large  and  difficult  area  for  further 
research  that  could  significantly  reduce  the  time  needed  to 
implement  a  new  language  using  this  system. 

The COBOL system has also demonstrated the restrictions that the
current PLATO system places on large lessons requiring a relatively
high usage of processor time. PLATO memory restrictions are the
primary cause of the size limitations imposed on the COBOL subset
and on COBOL programs. The rate of CPU usage needed by the COBOL
system exceeds the PLATO average (the system was designed for
lessons with far fewer calculations) and causes slower response
times than normally found on PLATO. Because the COBOL system has
not yet been used by large numbers of students, it is uncertain to
what extent these problems will limit the system's usefulness.

APPENDIX  A
COBOL  HELP  LESSON 

Appendices  A,  B,  C  and  D  consist  of  copies  of 
displays  seen  on  a  PLATO  screen.   This  appendix  is  the 
description  of  the  COBOL  subset  contained  in  lesson  "trw2" 
which  the  student  can  reference  while  writing  his  program 
in  the  editor. 

COBOL  Subset  on  PLATO

The  following  pages  give  a  description  of  the 
COBOL  subset  implemented  on  PLATO.  It  is  assumed  that 
you  have  some  knowledge  of  COBOL  fundamentals. 

Press...   for...

1          General Information
2          Format by Columns
3          Character set
4          Syntax Notation
5          DATA DIVISION
6          Print Files
7          PROCEDURE DIVISION
(help)     Explanation of Keys Used
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General  Information  about  PLATO-COBOL 


This  compiler  is  intended  for  use  by  beginning  students 
of  COBOL.  A  subset  of  COBOL  is  recogized  which  should  be 
sufficient  for  this  purpose. 

The  size  of  a  COBOL  program   is  restricted  to  the  size 
of  the  PLATO  screen.   Since  COBOL  programs  tend  to  be  large, 
this  is  a  significant  restriction.  In  order  to  minimize  the 
problem,   some  modifications   to  standard  COBOL   have  been 
made.  One  is  the  column  format,  as  given  on  the  next  page. 

A  second  change  is  that  the  IDENTIFICATION  DIVISION 
and  the  ENVIRONMENT  DIVISION  are  not  included.  You  begin 
your  program  by  entering  FD  in  columns  2  and  3. 

All files must have LABEL RECORDS OMITTED. Each file is
assumed to have been ASSIGNed to the PLATO terminal.
This means that all input must be supplied by the student
through the keyboard, while output will be displayed on the
screen.

Explanation of Symbols Used in Syntax Descriptions:


CAPS (underlined)   Required keyword

CAPS                Optional word

small               User-supplied word or literal

{ }                 Choice to be made

[ ]                 Optional phrase

...                 Optional repetition of previous entry

DATA DIVISION


Press...  for...

1  File Description Entries

2  Record Description Entries

3  Alphanumeric Data-items

4  Numeric-edited Data-items

5  Numeric Data-items

6  Other information

(back)  to get back to this page

File Descriptions


FD  file-name

    LABEL  { RECORD IS   }  OMITTED
           { RECORDS ARE }

    [ BLOCK CONTAINS integer  { CHARACTERS } ]
                              { RECORDS    }

    [ RECORD CONTAINS integer CHARACTERS ] .

only fixed length records are implemented


Record Description Entries


level-number  { item-name }
              { FILLER    }

    [ REDEFINES item-name ]

    [ OCCURS integer TIMES ]

    [ VALUE IS literal ]

    [ USAGE IS  { DISPLAY       } ]
                { COMPUTATIONAL }

    [ PIC picture-string ]


only 1 level of subscripting is allowed

see next topic for details of acceptable pictures

level number may not be 66 or 88


To insure that there will be enough space for your
input, make sure that no elementary item in an input file
is larger than 75 characters!!!

Alphanumeric Data-items


An alphanumeric data-item (AN) is an item whose
picture contains only the letter X. The repetition
factor may be used; this means that XXXXX and X(5)
both are valid pictures for a field that may
contain any 5 characters.

Numeric-Edited Data-items (NE)


A numeric-edited data-item (NE) is one whose picture
contains at least one of the following: $ - * Z . ,
0 and B. It may also contain: 9 and V. The symbols
+ P CR and DB which are part of standard COBOL
numeric-edited pictures are not available on PLATO.
The minus sign - may not appear at the right end
of the picture.

Numeric Data-items


A numeric item is one whose picture has at least
one 9. It may have S and V. P is not available.


A numeric item is assumed to be in external decimal
format (ED) unless its USAGE is COMPUTATIONAL, in which
case its format is binary (BI). USAGE COMP in PLATO-COBOL
may only be declared in WORKING-STORAGE, and pictures for
COMP items must have an S and 8 9s [for example S9(8) and
S9(6)V99 are valid]. The SYNC clause may not be specified,
but it is implicitly in effect for all COMP items; each
COMP item is 4 characters long. Using REDEFINES together
with COMP is strongly discouraged.


The maximum number of digit positions which may be
specified for an ED item is 14 (normally 18 digits are
allowed).

Other Info on DATA DIVISION


Alphabetic items (pictures with A) and alphanumeric-edited
items (pictures with X A B , 0) are not available.
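
The picture rules on the preceding screens can be summarized as a small classifier. The following Python sketch is illustrative only and is not taken from the PLATO lessons: the function names are hypothetical, the editing-symbol set ($ - * Z . , 0 B) is an interpretation of the screen text, and the further restrictions (the 14-digit limit for ED items, the position of the minus sign) are not checked.

```python
import re

# Editing symbols that make a picture numeric-edited (assumed reading of the screen text).
EDIT_SYMBOLS = set("$-*Z.,0B")


def expand_picture(picture):
    """Expand repetition factors, e.g. 'S9(6)V99' becomes 'S999999V99'."""
    return re.sub(r"(.)\((\d+)\)",
                  lambda m: m.group(1) * int(m.group(2)),
                  picture.upper())


def classify_picture(picture):
    """Return 'AN', 'NE', 'numeric' or 'unsupported' for a picture string."""
    symbols = set(expand_picture(picture))
    if symbols == {"X"}:
        return "AN"                      # only the letter X
    if symbols & EDIT_SYMBOLS and symbols <= EDIT_SYMBOLS | {"9", "V"}:
        return "NE"                      # at least one editing symbol, plus 9 and V
    if "9" in symbols and symbols <= {"9", "S", "V"}:
        return "numeric"                 # at least one 9, optionally S and V
    return "unsupported"                 # e.g. alphabetic pictures with A


def valid_comp_picture(picture):
    """COMP pictures must be numeric, carry an S, and contain exactly 8 nines."""
    chars = expand_picture(picture)
    return (classify_picture(picture) == "numeric"
            and "S" in chars
            and chars.count("9") == 8)


print(classify_picture("X(5)"))        # AN
print(classify_picture("ZZ9.99"))      # NE
print(valid_comp_picture("S9(6)V99"))  # True
```

Expanding the repetition factor first keeps the classification itself a matter of simple set comparisons.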
OUTPUT "PRINT" FILES


1. All output will be displayed on your PLATO screen in
the same way as a COBOL file is written to the printer. The
screen has 32 lines, each line has 64 positions.

2. The record size should be 65 characters or less. This
must include 1 position (the first) for a compiler-generated
control character. Do not move anything to this position;
it won't be shown if you do.

3. A block of a "print" file is considered by this compiler
to be a "page"; that is, the amount of print output to be
displayed on the screen at one time. This means that if
the BLOCK CONTAINS clause is omitted or is 1 RECORDS, the
output will be shown one line at a time. If, for example,
BLOCK CONTAINS 15 RECORDS, no output will be shown until
the 15th WRITE. Because the maximum number of characters
that can be stored is 1792, the absolute maximum number
of 64 character lines/page is 27 (this would leave only 37
characters for all other data).
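
The page limit quoted in item 3 follows from the storage figure and the record size. The short Python sketch below reproduces that arithmetic; the constant and function names are illustrative, and it assumes that each 64-position line occupies a full 65-character record (64 positions plus the control character).

```python
STORAGE_CHARS = 1792   # maximum number of characters that can be stored (from the text)
RECORD_CHARS = 65      # 64 visible positions + 1 control character (assumed per line)


def max_lines_per_page(storage=STORAGE_CHARS, record=RECORD_CHARS):
    """Return (whole records that fit in storage, characters left over)."""
    lines = storage // record
    leftover = storage - lines * record
    return lines, leftover


print(max_lines_per_page())  # (27, 37)
```

A 28th line would need 28 x 65 = 1820 characters, which exceeds the 1792 available.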
press...   for...

1  ADD, CLOSE, COMPUTE, EXIT, GO, IF, MOVE, MULTIPLY

2  OPEN, PERFORM, READ, STOP, SUBTRACT, WRITE

3  Legal MOVEs

4  Legal Comparisons

5  Types of Conditions Available

6  Other Information


(back)   To return to this page
ADD  { data-item }  TO data-item
     { literal   }

ADD  { data-item }  { data-item }  GIVING data-item
     { literal   }  { literal   }

CLOSE file-name ...

COMPUTE data-item = arithmetic-expression

EXIT

GO TO paragraph-name

GO TO paragraph-name DEPENDING ON data-item

IF condition  { NEXT SENTENCE }  ELSE  { NEXT SENTENCE }
              { statement ... }        { statement ... }

MOVE  { data-item }  TO data-item
      { literal   }

MULTIPLY  { data-item }  BY data-item
          { literal   }

MULTIPLY  { data-item }  BY  { data-item }  GIVING data-item
          { literal   }      { literal   }

OPEN  { INPUT  }  file-name ...
      { OUTPUT }

PERFORM paragraph-name THRU paragraph-name
    [ { data-item }  TIMES ]
      { integer   }

PERFORM paragraph-name THRU paragraph-name
    VARYING data-item FROM  { data-item }  BY  { data-item }
                            { literal   }      { integer   }
    UNTIL condition

READ file-name AT END statement ...

STOP RUN

SUBTRACT  { data-item }  FROM data-item
          { literal   }

SUBTRACT  { data-item }  FROM  { data-item }  GIVING data-item
          { literal   }        { literal   }

WRITE record-name [ BEFORE ADVANCING integer LINES ]
Information about Conditions


The only types of tests allowed in conditions are the
relation test (< = >) and the test for NUMERIC. Other class
tests, sign tests and condition-name tests are not provided.


Implied subjects or relational operators (such as
"IF A = B OR C...") are not allowed.


Compound conditions may be formed by connecting simple
conditions with AND or OR, but AND and OR cannot be used in
the same compound condition. NOT cannot be used as a
logical operator, but may be used in relation tests. This
means that IF A NOT = B is ok, but IF NOT A = B will cause
an error message.
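
Two of the restrictions above (no mixing of AND and OR, and no NOT as a logical operator) are simple enough to check mechanically. The following Python sketch is a hypothetical illustration, not the check performed by the COBOL editor; the function name and the message texts are invented here, and implied subjects such as "A = B OR C" are not detected.

```python
import re


def check_condition(condition):
    """Return a list of violated restrictions for one COBOL condition."""
    words = re.findall(r"[A-Za-z0-9-]+|[<=>]", condition.upper())
    if words and words[0] == "IF":
        words = words[1:]                      # tolerate a leading IF

    problems = []
    if "AND" in words and "OR" in words:
        problems.append("AND and OR cannot be used in the same compound condition")

    # A simple condition starts at the first word and after every AND / OR.
    starts = [0] + [i + 1 for i, w in enumerate(words) if w in ("AND", "OR")]
    if any(i < len(words) and words[i] == "NOT" for i in starts):
        problems.append("NOT cannot be used as a logical operator")
    return problems


print(check_condition("IF A NOT = B"))   # []
print(check_condition("IF NOT A = B"))   # ['NOT cannot be used as a logical operator']
```

NOT is accepted only when it follows the subject of a relation, which is why the check looks at the first word of each simple condition.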


Other Info about the PROCEDURE DIVISION


The ROUNDED option is not available; no rounding will
occur. Although SIZE ERROR cannot be specified, it is in
effect at all times; you will get warnings at execution time
if arithmetic overflows occur.


Full expressions may be written in COMPUTE statements
and in conditions, except that exponentiation (**) is not
provided.


Each sentence, including the first one, must be in a
paragraph. Section names can't be used, however.

At the end of your program, you must put END in
area A (columns 2-5) followed by a space, then press (stop).

APPENDIX B
COBOL  TABLES  IN  THE  COMPILER-WRITING  SYSTEM 

All  COBOL  tables  used  by  the  language-independent 
editor  routines,  except  the  syntax  code  table,  are  included 
in  this  appendix.   The  displays  shown  here  are  not  seen  by 
the student writing a COBOL program but by the COBOL implementor
when constructing or modifying the tables.

B.1 Primary tables


Press...    To change...

a  Name of language (COBOL)

b  Name of character set (COBOL in trw2)

c  Editor leslist index <15>: (cobolcomp)

d  Interpreter leslist index <20>: (cobolrun)

e  Internal character codes (64 codes)

f  Blank code (1)

g  Default tabs

h  Key Function Tables

i  Lexical Class Assignments (20 classes)

j  Lexical Node Table (30 nodes)

k  Safe Nodes

l  Predefined Symbols (110 names, 565 characters)

m  Field Symbols

n  New Symbol Allocation Classes


[SHIFT-BACK] to return tables to -trw6-

B.1.1 Internal character codes


Internal Character Codes

                16- /      32- E      48- U      64- #
 1- (blank)     17- ,      33- F      49- V
 2- ¢           18- %      34- G      50- W
 3- .           19- _      35- H      51- X
 4- <           20- >      36- I      52- Y
 5- (           21- ?      37- J      53- Z
 6- +           22- :      38- K      54- 0
 7- |           23- #      39- L      55- 1
 8- &           24- @      40- M      56- 2
 9- !           25- '      41- N      57- 3
10- $           26- =      42- O      58- 4
11- *           27- "      43- P      59- 5
12- )           28- A      44- Q      60- 6
13- ;           29- B      45- R      61- 7
14- ¬           30- C      46- S      62- 8
15- -           31- D      47- T      63- 9

B.1.2 Initial tab settings

[Screen display of the initial tab settings; not legible in the scanned copy.]
B.1.3 Key function tables


Press...    For...

1  Lower case keyboard

2  Upper case keyboard

3  Function keys

4  Lower case access keyboard

5  Upper case access keyboard

6  Access function keys


Press [back] to return to index page

Press (help) to locate a table by key

Press [SHIFT-HELP] for list of special function codes
 -1  Ignore key
 -2  Move cursor 1 character to right
 -3  Move cursor right one token
 -4  Move cursor right one line
 -5  Erase character to right
 -6  Move cursor left one character
 -7  Move cursor left one token
 -8  Move cursor left one line
 -9  Erase token to right
-10  Erase character to left of cursor
-11  Erase token to left of cursor
-12  Erase line to left of cursor
-13  Erase line to right
-14  Access alternate key set
-15  Tab
-16  Open "gap" one line
-17  Clear screen
-18  Return to index page
-19  Close "gap" by one line
-20  Help sequence 1
-21  Help sequence 2
-22  Switch lesson to background
-23  Compiler debugging aids
-24  Move cursor to end of text
-25  Unassigned
-26  Switch lesson to foreground
-27  Set tab
-28  Clear tab
-29  Reprint screen
-30  Move cursor to next line
B.1.4 Lexical character classes

B.1.5 Lexical node table


Each table entry has the form:  ([ ]=optional)

    number    E        Fn
              N [I]    FP
              X [I]    FB
              P        FE
              R        FD

The letter codes have the following meaning:

Letter   Meaning                   Interpretation of number
E        Lexical error             error message number
N        Non-extendible token      syntactic class
X        extendible token          syntactic class
I        Ignore token (do not      N.A.
         send to parser)
P        Procedure call            first node of procedure
R        Return from procedure     return code
(blank)  Accept character          next node

Fn       Generate field #n (1, 2, or 3)
FP       Field starts with next column
FB       Field begins with current column
FE       Field ends with current column
FD       Field ends with previous column

>        Continuation field


Press [next] to inspect the node table.
Press [back] to return to the index page.
B.1.8 Field tokens


Fields are special predefined, non-printable tokens
which may be used to pass position information to syna:


Field    Name Table Index
  1             8
  2            23
  3             0


Press '1', '2', or '3' to change any of these
Press [back] to return to index page.


B.1.9   New  symbol  allocation  classes 
B.2 Picture tables


Press...    To change...

a  Name of language (COBOL)
b  Name of character set (COBOL in trw2)
c  Editor leslist index <15>: (cobolcomp)
d  Interpreter leslist index <3>: (trw2)
e  Internal character codes (64 codes)
f  Blank code (1)
g  Default tabs
h  Key Function Tables
i  Lexical Class Assignments (20 classes)
j  Lexical Node Table (31 nodes)
k  Safe Nodes
l  Predefined Symbols (109 names, 574 characters)
m  Field Symbols
n  New Symbol Allocation Classes


[SHIFT-BACK] to return tables to -trw6-


B.2.1   Lexical  character  classes 


[Screen display of the lexical character class table; not legible in the scanned copy.]

APPENDIX C
OTHER  DOCUMENTATION 

Supplementary  documentation  for  the  COBOL  system, 
produced  by  auxiliary  programs,  is  included  in  this 
appendix.   Only  a  sample  of  the  transition  diagrams  is 
included,  since  they  include  the  same  information  as 
presented in the lexical node tables.

C.1 Transition diagram sample


Transition Diagram Generator


To display lexical transition diagrams for use with
tables of the Computer Assisted Programming System.
Explanation of Symbols

lexical node number
procedure call
error message number
syntactic class of an accepted token
lexical character class returned by subroutine
{ }  character class number, then first 5 class members


N    Non-extendible token

X    extendible token

I    Ignore token (do not send to parser)

Fn   Generate field #n (1, 2, or 3)

FP   Field starts with next column

FB   Field begins with current column

FE   Field ends with current column

FD   Field ends with previous column

Cn   Column #n


Node 0:  [Transition diagram; the graphical layout is not legible in the scanned copy.]


C.2 Flowchart of "cobolrun"


"cobolrun" flowchart


Except for the units of "trw" used by cobolrun,
each unit of "cobolrun" appears on at least one
page of this lesson. If it appears on more than 1
page, its "home" page (the one you get by typing
TERM unit-name) is the highest-numbered page that
it appears on.


The flowcharts are arranged by pages according
to their calling sequences. This means that the
called unit never appears on a lower-numbered page
than the calling unit (a "call" may be a TUTOR do,
join or goto).

C.3 Mode table diagram

APPENDIX D
SAMPLE  COBOL  PROGRAM 

In  this  appendix  several  displays  that  might  be 
seen by the student as he uses the COBOL system are shown.
This is far from a complete set; the student may also see
error messages, "help" information in the interpreter and
the information contained in Appendix A.

YOUR WORKSPACE IS EMPTY


Press...       To...


ILvOTI 


(edit) 

EH 

[DATA]


Write a new COBOL program.
Write in another language.
Edit one of your old programs.
Erase one of your programs.
Execute one of your programs.
See a list of programs on file.
Return to the lesson guide.
WORKSPACE CONTAINS TEXT
(Press (DATA) for details)


Press...       To...

(NEXT)  Edit your workspace some more.

(lhe)  Execute  your  workspace  as  a  program. 


G^^ickJ  Clear  your  workspace. 

(copy)  Copy your workspace into the file

[SHIFT-COPY]  Replace 'cobolk' with workspace.

(erase)  Erase file 'cobolk' from the file
COBOL EXECUTION STATISTICS


Variable Storage Used:         76 characters

% of Storage Used:             4 %

Statements Executed:           22

Lines Outputted:               1

Total Execution Time:          0.947 seconds

Average Time/statement:        0.0430 seconds

Average Statements/second:     23.2
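
The two averages on this display follow directly from the statement count and the total execution time. The Python sketch below reproduces them; the function name is illustrative, and taking the storage percentage relative to the 1792-character limit mentioned in Appendix A is an assumption, not something the display states.

```python
def execution_averages(statements, seconds):
    """Return (average seconds per statement, average statements per second)."""
    return seconds / statements, statements / seconds


per_statement, per_second = execution_averages(22, 0.947)
print(f"{per_statement:.4f} seconds/statement")   # 0.0430 seconds/statement
print(f"{per_second:.1f} statements/second")      # 23.2 statements/second

# Assumption: the percentage is taken against the 1792-character storage limit.
print(round(76 / 1792 * 100), "% of storage")     # 4 % of storage
```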
APPENDIX E
PROGRAM LISTINGS


In order to reduce the size of the report version of this
thesis, the program listings included in Appendix E of the original
thesis have been excluded.